Face tracking and pose estimation with automatic three-dimensional model construction
A method for robustly tracking and estimating the face pose of a person using stereo vision is presented. The method is invariant to identity and does not require previous training. A face model is automatically initialised and constructed online: a fixed point distribution is superimposed over the face when it is frontal to the cameras, and several appropriate points close to those locations are chosen for tracking. Using the stereo correspondence of the cameras, the three-dimensional (3D) coordinates of these points are extracted and the 3D model is created. The 2D projections of the model points are tracked separately on the left and right images using SMAT. RANSAC and POSIT are used for 3D pose estimation. Head rotations up to ±45° are correctly estimated, and the approach runs in real time. The method is intended to serve as the basis of a driver monitoring system and has been tested on sequences recorded in a moving car.
Ministerio de Educación y Ciencia; Comunidad de Madrid
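The stereo step described above, recovering 3D model points from left/right correspondences, can be sketched as follows for a rectified camera pair. The focal length, baseline and principal-point values below are illustrative, not the paper's actual calibration:

```python
import numpy as np

def triangulate_stereo(pts_left, pts_right, f, baseline, cx, cy):
    """Recover 3D coordinates of matched points from a rectified
    stereo pair.  pts_left / pts_right are (N, 2) pixel arrays;
    disparity is the horizontal offset between the two views."""
    pts_left = np.asarray(pts_left, dtype=float)
    pts_right = np.asarray(pts_right, dtype=float)
    disparity = pts_left[:, 0] - pts_right[:, 0]
    Z = f * baseline / disparity          # depth from disparity
    X = (pts_left[:, 0] - cx) * Z / f     # back-project into the camera frame
    Y = (pts_left[:, 1] - cy) * Z / f
    return np.stack([X, Y, Z], axis=1)
```

A point on the optical axis at 1 m with an 800 px focal length and a 120 mm baseline appears with a 96 px disparity, and the function recovers its depth exactly.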
Perception advances in outdoor vehicle detection for automatic cruise control
This paper describes a vehicle detection system based on a support vector machine (SVM) and monocular vision. The final goal is to provide the vehicle-to-vehicle time gap for automatic cruise control (ACC) applications in the framework of intelligent transportation systems (ITS). The challenge is to use a single camera as input in order to achieve a low-cost final system that meets the requirements needed to undertake serial production in the automotive industry. Candidate objects are first located in the image using vision, and their basic features are then combined with an SVM-based classifier. An intelligent learning approach is proposed in order to better deal with object variability, illumination conditions, partial occlusions and rotations. A large database containing thousands of object examples extracted from real road scenes has been created for learning purposes. The classifier is trained using SVM in order to be able to classify vehicles, including trucks. In addition, the vehicle detection system described in this paper provides early detection of passing cars and assigns a lane to target vehicles. In the paper, we present and discuss the results achieved to date in real traffic conditions.
Ministerio de Educación y Ciencia
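The SVM classification stage can be illustrated with a minimal linear SVM trained by sub-gradient descent on the hinge loss. This is a toy stand-in for the paper's classifier, which is trained on real image features and a large example database:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Minimal linear SVM fitted with sub-gradient descent on the
    regularised hinge loss.  X is (N, D), y takes values in {-1, +1}.
    Returns the weight vector w and bias b."""
    rng = np.random.default_rng(0)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):
            margin = y[i] * (X[i] @ w + b)
            if margin < 1:                 # sample violates the margin
                w += lr * (y[i] * X[i] - lam * w)
                b += lr * y[i]
            else:                          # only the regulariser acts
                w -= lr * lam * w
    return w, b

def predict(w, b, X):
    """Class labels in {-1, +1} for the rows of X."""
    return np.sign(X @ w + b)
```

On a small linearly separable set the learned hyperplane classifies every training sample correctly.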
Integrating state-of-the-art CNNs for multi-sensor 3D vehicle detection in real autonomous driving environments
2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27-30 Oct. 2019
This paper presents two new approaches to detect surrounding vehicles in 3D urban driving scenes and their corresponding Bird's Eye View (BEV). The proposals integrate two state-of-the-art Convolutional Neural Networks (CNNs), namely YOLOv3 and Mask R-CNN, in a framework presented by the authors in [1] for 3D vehicle detection that fuses semantic image segmentation and LIDAR point clouds. Our proposals take advantage of multimodal fusion, geometric constraints and pre-trained modules inside our framework. The methods have been tested using the KITTI object detection benchmark and a comparison is presented. Experiments show the new approaches improve results with respect to the baseline and are on par with other competitive state-of-the-art proposals, while being the only ones that do not apply an end-to-end learning process. In this way, they remove the need to train on a specific dataset and show a good capability of generalization to any domain, a key point for self-driving systems. Finally, we have tested our best proposal on KITTI in our own driving environment, without any adaptation, obtaining results suitable for our autonomous driving application.
Ministerio de Economía y Competitividad; Comunidad de Madrid
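At the heart of fusing a 2D detector with a LIDAR point cloud is projecting the cloud into the image and keeping the points that fall inside each detection box. A minimal sketch of that association step, assuming points already transformed into the camera frame and an illustrative intrinsic matrix:

```python
import numpy as np

def points_in_detection(points_xyz, K, box):
    """Project LIDAR points (already expressed in the camera frame)
    with the intrinsic matrix K and keep those that fall inside a 2D
    detection box (x1, y1, x2, y2) -- the image/cloud association."""
    pts = np.asarray(points_xyz, dtype=float)
    pts = pts[pts[:, 2] > 0]                     # drop points behind the camera
    uv = (K @ pts.T).T                           # pinhole projection
    uv = uv[:, :2] / uv[:, 2:3]                  # perspective divide
    x1, y1, x2, y2 = box
    mask = (uv[:, 0] >= x1) & (uv[:, 0] <= x2) & \
           (uv[:, 1] >= y1) & (uv[:, 1] <= y2)
    return pts[mask]
```

With a toy intrinsic matrix, a point on the optical axis projects to the principal point and is kept, while lateral points and points behind the camera are rejected.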
Real-Time Bird's Eye View Multi-Object Tracking system based on Fast Encoders for object detection
2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), September 20-23, 2020, Rhodes, Greece (virtual conference)
This paper presents a real-time Bird's Eye View Multi-Object Tracking (MOT) system pipeline for an autonomous electric car, based on Fast Encoders for object detection and a combination of the Hungarian algorithm and a Bird's Eye View (BEV) Kalman filter, used respectively for data association and state estimation. The system is able to analyze 360 degrees around the ego-vehicle as well as estimate the future trajectories of the surrounding objects, the essential input for other layers of a self-driving architecture, such as control or decision-making. First, our system pipeline is described, merging the concepts of online and real-time DATMO (Detection and Tracking of Multiple Objects), ROS (Robot Operating System) and Docker to ease the integration of the proposed MOT system in fully autonomous driving architectures. Second, the system pipeline is validated using the recently proposed KITTI-3DMOT evaluation tool, which demonstrates the full strength of 3D localization and tracking of a MOT system. Finally, a comparison of our proposal with other state-of-the-art approaches is carried out in terms of performance, using the mainstream metrics of MOT benchmarks and the recently proposed integral MOT metrics, which evaluate the performance of the tracking system over all detection thresholds.
Ministerio de Ciencia, Innovación y Universidades; Comunidad de Madrid
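The data-association step, matching existing tracks to new detections with the Hungarian algorithm on BEV distances, can be sketched as follows. The 2 m gating threshold is an illustrative assumption, not the paper's tuned value:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks, detections, max_dist=2.0):
    """Match track centroids to new detections (both (N, 2) BEV
    positions in metres) with the Hungarian algorithm, rejecting
    pairs farther apart than max_dist."""
    if len(tracks) == 0 or len(detections) == 0:
        return []
    # pairwise Euclidean distance matrix used as the assignment cost
    cost = np.linalg.norm(tracks[:, None, :] - detections[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]
```

Each accepted pair would then feed the corresponding Kalman filter update; unmatched detections spawn new tracks and unmatched tracks age out.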
Error Analysis in a Stereo Vision-Based Pedestrian Detection Sensor for Collision Avoidance Applications
This paper presents an analytical study of the depth estimation error of a stereo vision-based pedestrian detection sensor for automotive applications such as pedestrian collision avoidance and/or mitigation. The sensor comprises two synchronized and calibrated low-cost cameras. Pedestrians are detected by combining a 3D clustering method with Support Vector Machine (SVM) based classification. The influence of the sensor parameters on the stereo quantization errors is analyzed in detail, providing a point of reference for choosing the sensor setup according to the application requirements. The sensor is then validated in real experiments. Collision avoidance maneuvers by steering are carried out by manual driving. A real-time kinematic differential global positioning system (RTK-DGPS) is used to provide ground truth data corresponding to both the pedestrian and the host vehicle locations. The performed field tests provided encouraging results and proved the validity of the proposed sensor for use in the automotive sector in applications such as autonomous pedestrian collision avoidance.
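The stereo quantization error analyzed above follows from the depth equation Z = f·b/d: a disparity error of Δd pixels produces a depth error that grows quadratically with range, ΔZ ≈ Z²·Δd/(f·b). A small helper makes the relation concrete; the rig parameters in the example are illustrative, not the sensor's actual setup:

```python
def depth_quantization_error(Z, f, baseline, delta_d=1.0):
    """Depth uncertainty of a stereo rig caused by disparity
    quantization.  From Z = f*b/d, a disparity error of delta_d
    pixels yields dZ ~= Z**2 * delta_d / (f * b).
    Units: Z and baseline in metres, f and delta_d in pixels."""
    return Z ** 2 * delta_d / (f * baseline)
```

Doubling the range quadruples the error, which is why the baseline/focal-length choice matters so much at collision-avoidance distances.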
Simulation of autonomous vehicles using V-REP under ROS
[Abstract] This article presents the main characteristics of the simulation environment being used for the development of different autonomous driving algorithms. These developments are part of an autonomous vehicle driving project within the Spanish National Research Plan, named SmartElderlyCar and carried out by the Universidad de Alcalá (UAH) and the Universidad de Vigo (UVIGO). A commercial vehicle has been successfully simulated in V-REP, controlled by nodes developed under ROS, on the external campus of the UAH, and it has been driven along its lanes following the centre line by means of a trajectory-following algorithm.
Ministerio de Economía y Competitividad; TRA2015-70501-C2-1-R. Ministerio de Economía y Competitividad; TRA2015-70501-C2-2-
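The lane-centre following behaviour could be realised, for example, with a pure-pursuit controller; the article does not name the specific trajectory-following algorithm, so this is only an illustrative sketch with an assumed wheelbase:

```python
import math

def pure_pursuit_steering(x, y, yaw, target, wheelbase=2.7):
    """One step of a pure-pursuit path follower.  Given the vehicle
    pose (x, y, yaw) and a look-ahead point `target` = (tx, ty) on the
    lane centre line (world frame), return the front-wheel steering
    angle that arcs the vehicle toward the target."""
    tx, ty = target
    dx, dy = tx - x, ty - y
    # express the look-ahead point in the vehicle frame
    lx = math.cos(-yaw) * dx - math.sin(-yaw) * dy
    ly = math.sin(-yaw) * dx + math.cos(-yaw) * dy
    ld = math.hypot(lx, ly)               # look-ahead distance
    alpha = math.atan2(ly, lx)            # heading error to the target
    return math.atan2(2.0 * wheelbase * math.sin(alpha), ld)
```

A target straight ahead yields zero steering; a target to the left yields a positive (left) steering angle.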
A text reading algorithm for both natural and born-digital images
Reading text in natural images has once again attracted the attention of many researchers during the last few years due to the increasing availability of cheap image-capturing devices in low-cost products like mobile phones. Since text can be found in almost any environment, the applicability of text-reading systems is very extensive. For this purpose, we present in this paper a robust method to read text in natural images. It is composed of two main separate stages. First, text is located in the image using a set of simple and fast-to-compute features, based on geometric and gradient properties, that are highly discriminative between character and non-character objects. The second stage carries out the recognition of the previously detected text: it uses gradient features to recognize single characters and Dynamic Programming (DP) to correct misspelled words. Experimental results obtained with different challenging datasets show that the proposed system exceeds state-of-the-art performance, both in terms of localization and recognition.
Ministerio de Economía y Competitividad; Comunidad de Madrid
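The DP word-correction stage can be illustrated with the classic dynamic-programming edit distance, snapping a recognized string to its closest lexicon entry. The lexicon and helper names here are hypothetical, not the paper's actual components:

```python
def edit_distance(a, b):
    """Levenshtein distance computed with dynamic programming --
    the kind of DP matching used to snap an OCR'd string to the
    closest dictionary word."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                      # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j                      # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[m][n]

def correct_word(word, lexicon):
    """Return the lexicon entry with the smallest edit distance."""
    return min(lexicon, key=lambda w: edit_distance(word, w))
```

A misrecognized character is thus repaired by the nearest dictionary word, mirroring the misspelling-correction role DP plays in the pipeline.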